Course notes: Career Essentials in Generative AI by Microsoft and LinkedIn

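The learning path has 6 courses. The content is very easy, especially the first few chapters. If you are interested in AI but currently know little about it, I recommend it as a first introductory course: it makes it easy to get started with some large-model application tools and to grasp the main mechanisms of machine learning algorithms. (The course is entirely in English; turning on translated subtitles is recommended.)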

Course link: https://www.linkedin.com/learning/paths/career-essentials-in-generative-ai-by-microsoft-and-linkedin

Below are some of my own notes on the course:

1. Limitations of GPT-3

Lack of common sense
Lack of creativity
No understanding of generated text
Biased databases
Danger of normalization of mediocrity with creative writing

2. Uses of VAEs

A variational autoencoder (Variational Auto-Encoder, VAE) is one form of deep generative model (GANs are another). It is a generative network architecture based on variational Bayesian inference.

Financial fraud
Manufacturing flaws
Network security breaches

3. Search Engines vs. Reasoning Engines

Search Engines	Reasoning Engines
Explore a subject further	Understand and interpret human language
Not optimized for deeper questions	Provide direct relevant responses
Don’t truly understand a query	Maintain context and understand intent

4. Prompt Engineering Resources

OpenAI documentation
ChatGPT Discord server
Prompt Engineering Guide
PromptVine
Learn Prompting
PromptPapers
PromptHub

5. Questions to consider when designing AI products

“What is the highest standard of responsible human behavior?”
“What actions best promote fairness and dignity?”
“Have we trained an AI to provide an answer lower than the highest standard of responsible human behavior?”
“If yes, what can we do to help an AI answer to be free of human bias?”

6. Vilas’ ethical AI framework

The framework's three pillars:

Responsible data practices
- What is the source of the training data?
- What has been done to reduce bias in the data?
- How might the data we’re using perpetuate historic bias?
- What opportunities exist to prevent biased decision-making?
Boundaries on safe and appropriate use
- Who is the target population for this tool?
- What are their main goals and incentives?
- What is the most responsible way to achieve these goals?
Robust transparency
- How did the tool arrive at its output?
- What other ways do we have of testing fairness?
- Can decision makers easily understand the input-analysis-output process?
- Have you engaged with a broad range of stakeholders?

7. Ethical Data Organization

Prioritizing privacy
Reducing bias
Promoting transparency

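Technical teams must have a strong internal ethical culture, together with external oversight and accountability, to ensure that the decisions we make are ethical.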

8. Creating a Culture of Ethical Decision-Making

Foster ethical communication
Establish ethical training

Responsible AI Policy and Governance Framework: first, make sure a responsible AI policy and governance framework is in place. This is top management's statement of how the organization should design and manage AI technology.

Additional C-Suite Responsibilities

Identify specific metrics
Create regular reporting mechanisms on AI practices
Hire a chief AI ethics officer

Preparing the Board of Directors in Ethical AI

Ensure policies and procedures exist for ethical concerns
Ensure necessary resources and expertise
Ensure alignment with regulatory requirements

9. The LISA thinking model

Listen to users before you start
Involve customers in decisions
Share privacy policies
Audit your work

10. Establishing Communication

Establish various sessions between groups
Develop training programs
Create a cross-functional team
Systematically collect and address user feedback
Engage formally and informally with external stakeholders

11. Becoming an Ethical Leader in AI

Incorporate communities into design
Build skills beyond technology
Become a steward of a human-centered future

12. Artificial Intelligence: a system that shows behavior that could be interpreted as human intelligence.

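13. A major advantage of AI learning from data is that the machine can keep improving as it is given more data.

14. Machine learning systems are still only identifying patterns.

15. Artificial Neural Network

An AI system that mimics the structure of the human brain (it uses brain-like structures to break down massive data sets). It is currently one of the most popular approaches to machine learning.

Rather than posing questions, an artificial neural network uses hundreds or even millions of numerical dials. Its structure: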

Input layer
Hidden layers
Output layer
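
The three layers above can be sketched as a forward pass. This is only a minimal illustration, not something taken from the course: the layer sizes, the sigmoid activation, and the random starting values are all assumptions. The weights and biases play the role of the "numerical dials"; training (not shown) would adjust them.

```python
import math
import random

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum of the inputs plus a bias, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(inputs, network):
    """Pass the input values through every layer in turn."""
    activations = inputs
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations

def random_layer(n_in, n_out, rng):
    """Create the 'dials' for one layer with random starting values."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

rng = random.Random(0)
# Assumed sizes: 3 input values -> 4 hidden neurons -> 2 output neurons.
network = [random_layer(3, 4, rng), random_layer(4, 2, rng)]
output = forward([0.5, -0.2, 0.1], network)
print(output)
```

The input layer is simply the list of numbers passed to `forward`; the first entry of `network` is the hidden layer and the second is the output layer.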

16. All binary classification uses supervised learning.

17. Data Clusters: groupings that the machine creates on its own from the data, using unsupervised learning.

18. Classifying & Clustering

Supervised learning = Classifying

Unsupervised learning = Clustering

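If you are using supervised learning, you are classifying; if you are using unsupervised learning, you are clustering.

One of the biggest advantages of clustering is that far more unlabeled data is available.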

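19. Reinforcement Learning

A type of machine learning algorithm that uses rewards as an incentive for the system to find new patterns.

*Introduction to Reinforcement Learning: Basic Ideas and Classic Algorithms (original title: 强化学习入门：基本思想和经典算法)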

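20. Q-Learning

Helps create more complex rewards: the system learns a quality value (Q-value) for each action in each state.

*An Introduction to Q-Learning in Reinforcement Learning (original title: 强化学习中的Q-Learning介绍)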
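
As a rough illustration of how Q-Learning turns rewards into learned values, here is a minimal sketch of the standard tabular update rule, Q(s, a) ← Q(s, a) + α · (r + γ · max Q(s', a') − Q(s, a)). The state numbers, action names, learning rate `alpha`, and discount `gamma` are made-up values for the example, not taken from the course.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one Q-Learning update to the table q and return the new Q-value."""
    # Best value the agent currently believes it can get from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Target = immediate reward plus discounted future value.
    target = reward + gamma * best_next
    # Move the old estimate part of the way toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

# Hypothetical toy setting: two actions, all Q-values start at 0.
q = defaultdict(float)
actions = ["left", "right"]

first = q_update(q, state=0, action="right", reward=1.0, next_state=1, actions=actions)
second = q_update(q, state=0, action="right", reward=1.0, next_state=1, actions=actions)
print(first, second)  # 0.5 0.75
```

Each repeated reward pulls the estimate closer to the target, so the value of taking "right" in state 0 grows from 0 to 0.5 and then to 0.75.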

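21. K Nearest Neighbor (KNN)

A common supervised machine learning algorithm for multiclass classification.

*Machine Learning Algorithms: How the k-Nearest Neighbor (KNN) Classification Algorithm Works (original title: 机器学习算法之——K最近邻(k-Nearest Neighbor，KNN)分类算法原理讲解)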
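
A minimal sketch of the KNN idea, assuming a tiny made-up data set with three classes: to classify a new point, find the k labeled points closest to it and take a majority vote. The points, labels, and the choice k=3 are illustrative only.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbors."""
    # Pair each training point's distance to the query with its label, nearest first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labeled data: three classes, so this is multiclass classification.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9),
                (9.0, 1.0), (9.1, 0.8)]
train_labels = ["a", "a", "a", "b", "b", "c", "c"]

prediction = knn_predict(train_points, train_labels, (1.1, 1.0))
print(prediction)  # a
```

Because the labels must be known in advance, this is supervised learning; the "training" step is just storing the labeled points.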

22. Euclidean Distance

A mathematical formula for measuring the distance between data points.
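
For two points p and q with the same number of coordinates, the Euclidean distance is the square root of the sum of the squared coordinate differences. A small sketch, using a made-up pair of points:

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Example points chosen to form a 3-4-5 right triangle.
d = euclidean_distance((0, 0), (3, 4))
print(d)  # 5.0
```

Python's standard library offers the same calculation as `math.dist` (Python 3.8 and later).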

23. K-Means Clustering

An unsupervised machine learning algorithm.

*[Machine Learning] K-means (Very Detailed) (original title: 【机器学习】K-means（非常详细）)
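
A minimal sketch of how K-Means groups unlabeled points, assuming made-up 2D data and hand-picked starting centroids (real implementations usually choose the starting centroids randomly or with a smarter scheme). It repeats two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points.

```python
import math

def k_means(points, centroids, iterations=10):
    """Return (centroids, clusters) after a fixed number of assign/update rounds."""
    for _ in range(iterations):
        # Assignment step: put every point into the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(clusters)
```

No labels are supplied anywhere, which is what makes this unsupervised: the groups come entirely from the distances between points.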

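24. Regression Analysis

A supervised machine learning algorithm.

*A Complete Guide to Regression Algorithms: Understanding Regression Models in Machine Learning in One Article (original title: 回归算法全解析！一文读懂机器学习中的回归模型)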
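
As one concrete case of regression analysis, here is a sketch of simple linear regression fitted by ordinary least squares. The data is made up so that it lies exactly on the line y = 2x + 1; the course does not prescribe this particular method.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled examples (inputs xs with known outputs ys).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

The known outputs `ys` are what make this supervised: the model is fitted to examples whose correct answers are already given, and then predicts a number for new inputs.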

25. Naive Bayes Algorithm

A popular machine learning algorithm.

*Machine Learning: Naive Bayes Principles and Code (original title: 《机器学习》之朴素贝叶斯原理及代码)

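26. Ensemble method: Bagging (bootstrap aggregating)

Uses multiple versions of the same machine learning algorithm, each trained on a different random sample of the data, and combines their predictions.

*An Introduction to Common Model Ensemble Methods: Bagging, Boosting, and Stacking (original title: 常用的模型集成方法介绍：bagging、boosting 、stacking)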
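
A minimal sketch of bagging, under several assumptions of my own: the base learner is a 1-nearest-neighbor classifier on one-dimensional data, there are 15 models, and the data set is invented. Each model sees a different bootstrap sample (drawn with replacement), and the ensemble predicts by majority vote.

```python
import random
from collections import Counter

def train_1nn(sample):
    """Base learner (an assumption for this sketch): 1-nearest-neighbor on 1D data."""
    def predict(x):
        # Return the label of the sampled point closest to x.
        return min(sample, key=lambda pair: abs(pair[0] - x))[1]
    return predict

def bagging(data, n_models=15, seed=0):
    """Train n_models copies of the same algorithm on bootstrap samples of data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn with replacement.
        bootstrap = [rng.choice(data) for _ in data]
        models.append(train_1nn(bootstrap))

    def predict(x):
        # Aggregate the individual predictions by majority vote.
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict

# Hypothetical labeled data: (feature value, label).
data = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
        (8.0, "high"), (8.5, "high"), (9.0, "high")]
ensemble = bagging(data)
print(ensemble(1.2), ensemble(8.7))
```

Every model here is the same algorithm; only the training sample differs. That is the difference from stacking, which combines different algorithms.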

27. Ensemble method: Stacking

Uses several different machine learning algorithms and then stacks them together.

Note: boosting is another family of ensemble methods.

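28. Cost Function

A number the system uses to measure its answers against the correct answers. The terms "loss function" and "cost function" are often used interchangeably, although strictly speaking the loss is usually defined for a single example and the cost is the aggregate over the whole training set.

*What Is the Difference Between the Objective Function, Loss Function, and Cost Function in Machine Learning? (original title: 机器学习中的目标函数、损失函数、代价函数有什么区别？)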
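
To make "a number that measures the answer against the correct answer" concrete, here is a sketch using mean squared error, one common choice of cost function (the course does not single out a specific one). Following one common convention, the per-example function is called the loss and its average over the data set is called the cost. The predictions and targets are made up.

```python
def squared_error(prediction, target):
    """Loss for a single example: squared difference from the correct answer."""
    return (prediction - target) ** 2

def mean_squared_error(predictions, targets):
    """Cost over a data set: the average of the per-example losses."""
    losses = [squared_error(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)

# Hypothetical model outputs and the corresponding correct answers.
predictions = [2.5, 0.0, 2.0]
targets = [3.0, 0.0, 1.0]
cost = mean_squared_error(predictions, targets)
print(cost)
```

A cost of 0 would mean every prediction matched its target exactly; training a model amounts to adjusting it so that this number gets smaller.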
